Combination of SPLICE and Feature Normalization for Noise Robust Speech Recognition
Abstract
It is well known that the performance of automatic speech recognition (ASR) systems is easily affected by acoustic mismatch between training and testing conditions. This mismatch is often caused by various kinds of environmental noise or distortion. To reduce its effect, feature normalization, feature enhancement, model adaptation, and related approaches have been studied intensively. Cepstral mean normalization (CMN), mean and variance normalization (MVN), and histogram equalization (HEQ) are well-known feature normalization methods. Stereo-based piecewise linear compensation for environments (SPLICE) is a feature enhancement method. In this paper, we describe how to combine these methods to effectively improve the robustness of ASR systems. In experiments performed on the Aurora-2 database, a good combination showed a 41% improvement in word error rate over SPLICE only, and a 25% improvement over the conventional combination of SPLICE and CMN.
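As a rough illustration of the three normalization methods named in the abstract, the following sketch applies CMN, MVN, and HEQ to a single utterance. It is not the paper's implementation: the per-utterance statistics, the standard normal reference distribution for HEQ, the function names, and the synthetic 13-coefficient features are all assumptions made for the example, and SPLICE itself is not shown.

```python
import numpy as np
from statistics import NormalDist


def cmn(features):
    """Cepstral mean normalization: subtract each coefficient's utterance mean."""
    return features - features.mean(axis=0, keepdims=True)


def mvn(features, eps=1e-8):
    """Mean and variance normalization: zero mean, unit variance per coefficient."""
    centered = features - features.mean(axis=0, keepdims=True)
    return centered / (features.std(axis=0, keepdims=True) + eps)


def heq(features):
    """Histogram equalization onto a standard normal reference (assumed choice)."""
    n_frames = features.shape[0]
    # Rank of each frame within its coefficient track (0 .. n_frames - 1).
    ranks = features.argsort(axis=0).argsort(axis=0)
    # Mid-point empirical CDF values, strictly inside (0, 1).
    cdf = (ranks + 0.5) / n_frames
    # Map each CDF value through the inverse CDF of the reference distribution.
    inverse_cdf = np.vectorize(NormalDist().inv_cdf)
    return inverse_cdf(cdf)


# Hypothetical utterance: 200 frames of 13 cepstral coefficients
# with an artificial offset and scale standing in for channel mismatch.
rng = np.random.default_rng(0)
utterance = 3.0 + 2.0 * rng.standard_normal((200, 13))

cmn_out = cmn(utterance)
mvn_out = mvn(utterance)
heq_out = heq(utterance)
```

CMN removes only the offset, MVN additionally rescales each coefficient, and HEQ reshapes the whole distribution while preserving the frame ordering within each coefficient.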
Similar resources
Improving the performance of MFCC for Persian robust speech recognition
Mel-frequency cepstral coefficients (MFCCs) are the most widely used features in speech recognition, but they are very sensitive to noise. In this paper, to achieve satisfactory performance in automatic speech recognition (ASR) applications, we introduce a new noise-robust set of MFCC vectors estimated through the following steps. First, spectral mean normalization is a pre-processing which applies to t...
A new method for robust speech recognition based on missing data using a bidirectional neural network
The performance of speech recognition systems is greatly reduced when speech is corrupted by noise. One common approach to robust speech recognition is the missing-feature method. In this approach, the components of the time-frequency representation of the signal (spectrogram) that present a low signal-to-noise ratio (SNR) are tagged as missing and deleted, then replaced by remaining components and statistical ...
A recursive feature vector normalization approach for robust speech recognition in noise
The acoustic mismatch between testing and training conditions is known to severely degrade the performance of speech recognition systems. Segmental feature vector normalization [8] was found to improve the noise robustness of MFCC feature vectors and to outperform other state-of-the-art noise compensation techniques in speaker-dependent recognition. The objective of feature vector normalization...
Feature and distribution normalization schemes for statistical mismatch reduction in reverberant speech recognition
Reverberant noise has been a major concern in speech recognition systems. Many speech recognition systems, even with state-of-art features, fail to respond to reverberant effects and the recognition rate deteriorates. This paper explores the significance of normalization strategies in reducing statistical mismatches for robust speech recognition in reverberant environment. Most normalization wo...
The dependence of feature vectors under adverse noise
The performance degradation of automatic speech recognition systems due to acoustic mismatch between training and testing environments is a severe problem for the practical use of speech recognizers [1]. In this paper, we explore the effects of noise on individual speech feature vector statistics, and several feature normalization methods are used to compensate for environmental influence on feature vectors. We ...
Publication date: 2012